Applied AI Example: Literature Searching & Automating Reference Manager with Claude

Jun 13

By:
Alexander Scheer, CISSP, CISM, AAISM, QTE 501
Mantle AI, Inc.

In this post, I’ll walk through two related workflows:

Using Claude and MCP to make literature search less painful, especially when working across open-access sources.
Using Claude Code with a reference manager to maintain, update, and extract more value from a research corpus. In my case, that reference manager is Zotero, though the same pattern should translate to other tools. There are also open-source MCP projects for Mendeley and EndNote; I couldn’t find one for ReadCube. The configuration will differ, but the underlying concept is the same.

Where this is relevant:

This is most useful in disciplines where research is a core part of the job and where the volume of material starts to benefit from LLM and agentic capabilities: synthesis, abstraction across APIs and sources, pattern identification, and the ability to reason across messy data sets.

For years, I’ve been frustrated by the inefficiency of literature searching: finding papers, downloading them, tagging them, keeping them sorted in my reference manager, and then later going back to figure out what I saved, why I saved it, and which papers are actually relevant to a particular topic.

My search process has also always been frustrating. The results are inconsistent, and I often do not find what I’m looking for because, in part, I can’t describe my inquiry in natural language. Or, more accurately, I can describe it in natural language — just not in the exact combination of terms, Boolean operators, and database-specific query magic required to obtain a useful result.

On one hand, this is almost certainly a “me” problem.

On the other hand, the beauty of AI is that it lets you spend more time on the meta-layer: the goal you’re trying to achieve, the problem you are trying to solve, the question you’re trying to answer, and the output you would like. This translates to spending less time on non-differentiated activities like fine-tuning a laborious workflow, wrangling custom Python scripts, and convincing myself that my time is being time spent wisely on this activities.

I’ve been experimenting with two tools that seem to augment my specific working pattern, with Zotero serving as my reference manager:

MCP for Zotero
paper-search-mcp

Having now combined them with a simple workflow in Claude, I’m starting to see real efficiencies — and, more interestingly, a new way to breathe life into the research corpora I’ve been building in Zotero: the pile of papers, notes, websites, PDFs, videos, and other “I should come back to this later” material.

I know this will have utility for others, particularly in fields like biotech, where we almost exclusively work.

So, this is a writeup of how the workflow works for me, including how it enabled an actual literature search inquiry I had and helped me get the discovered content into my reference manager.

  
  My Prompt
  “I’m looking for papers related to the latest applied thinking on ICH E6 R3 as a clinical-trial Sponsor with a consortium of third-party partners — CROs, clinical sites, regulatory consultants, and the like. To the extent possible, find papers covering wearables. I’m looking for at least 5 papers.”

The tools

Both of these connect to Claude through the Model Context Protocol (“MCP”), which is an open standard that lets an AI assistant reach outside tools and data without anyone having to mess with each connection through progammatic interfaces ("APIs"). The built-in connectors in Claude all use MCP.

The first, paper-search-mcp, is an open-source project built on a free-first principle, which is to say it leans on open and public sources before it reaches for anything paid.

What is great about it is that I can give it a single query in natural language, and it fans that query out across more than twenty open scholarly sources at once, among them arXiv, PubMed, Crossref, OpenAlex, Europe PMC, bioRxiv, and Semantic Scholar, then it deduplicates whatever comes back and pulls the open-access full text wherever a lawful link exists.

Note: See the end of this blog for the matrix from the paper-search-MCP repository on what’s supported.

The second, the Zotero connector, is a hosted proxy that sits in front of the Zotero API, and by the author's own description it holds nothing beyond encrypted credentials. Once it is connected, Claude can search my library, drop items into a specific collection, tag them, write notes against them, and pull citations back out.

This is where I started to see additional value once the paper searching had completed. I feel it really has unlocked a new dimension to the data in my reference manager, including how to get information in there, tagged, and annotated.

The sample workflow

The workflow around the two tools is simple...

Claude Desktop: Write prompt geared around your paper search inquiry. I recommend turning off other MCP connectors and only leaving the paper-search-mcp connector enabled as Claude tends to try other tools when hitting roadblocks.
Claude Code: Copy the table into Code and prompt it to “verify links/DOI references, download, import into [name of] collection, tag, and annotate with notes”.

Now, I recognize this could be integrated through Claude Cowork, bridging both connectors, but I personally prefer the terminal a lot of my AI work, so that is a style choice I’ve made. Conversely, I could also exclusively use Claude Code for the paper-search-mcp component, but I was interested in trying it in the Claude Desktop console. There is no single answer in this AI landscape - it is truly about experimenting and following what works for you.

That table then goes into Claude Code, and from there the Zotero connector does the admin part:
a) verify DOIs with WebFetch tool
a)pulling each paper
b) filing it into the right collection
c) tagging it
d) writing notes into the entries

Example below from Claude Code after it imported all items into my GCP collection:

And this is the data from the Zotero perspective:

What else the Zotero connector is capable of?

Core Capabilities

Search Your Zotero Library

MCP for Zotero can search across your Zotero library using natural-language requests.

It can help find items by:

Title
Author
Topic
Year
Tag
Collection
PDF content, where indexed by Zotero Desktop

Add New References

The tool can add new items to Zotero through natural-language descriptions.

Instead of manually entering metadata, the assistant can help look up and create records for:

Journal articles
Books
Theses
Papers
Other reference types supported by Zotero

It can also help with DOI lookup and metadata population.

Organize Collections

MCP for Zotero can help create and manage Zotero collections.

It can support workflows such as:

Creating a new collection
Adding items to a collection
Moving papers into a specific research folder
Organizing references around a topic or project

Manage Tags

The tool can apply and update tags on Zotero items.

This is useful for building more structured research libraries around themes, source types, or review status.

Update Metadata

MCP for Zotero can help update item metadata in Zotero.

This can include cleaning up or enriching records with better information, such as:

Title
Author
Publication year
DOI
Item type
Publication venue
Tags
Notes

This is particularly useful when references are imported with incomplete or messy metadata.

Export Citations and Bibliographies

The tool can export citations and bibliographies from Zotero.

Supported export examples include:

BibTeX
RIS
APA
Chicago
Other formatted citation styles supported through Zotero

Work with Group Libraries

MCP for Zotero can access Zotero group libraries in addition to a personal library.

This is useful for collaborative research workflows where a shared Zotero library is used by a team, lab, company, or project group.

Search Inside PDFs

Where Zotero Desktop has indexed PDF content, the tool can search inside PDFs.

This makes it possible to ask for papers based not only on citation metadata, but also on concepts or language contained in the actual paper text.

APPENDIX A - Getting Started

Zotero MCP Connector

Go to the creator’s site: https://mcpforzotero.alejandroarnaud.dev/
Sign-up with Google OAuth (asks for basic profile information). If you find you don’t want to continue using tools, you can clean-up the OAuth authorization in your Google account here: https://myaccount.google.com/connections
Create the Zotero token: https://www.zotero.org/settings/security#applications. Click on “Create a new private key”
Add the token to the configuration screen on mcpforzotero.com: “Connect your Zotero library”
Create the access tokens for Claude - one for Desktop, one for Code: “Access Tokens”
Copy the generated configurations below the “Access Tokens” section for your applicable terminal - Claude Desktop, Code, ChatGPT
For Claude Desktop, you can find the configuration file here:
OSX: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
For Claude Code, you can find the configuration file here:
~/.claude.json
or, at your project root for a given project: mcp.json
Close Claude and re-open it

Paper Search MCP

Load as a Claude Code skill:
Step 1 — Clone the repo:
```
git clone https://github.com/openags/paper-search-mcp.git ~/paper-search-mcp
```
Step 2 — Install the skill:
```
mkdir -p ~/.claude/skills/paper-search
cp ~/paper-search-mcp/claude-code/SKILL.md ~/.claude/skills/paper-search/SKILL.md
```
Step 3 — Update the repo path in the skill:
Edit ~/.claude/skills/paper-search/SKILL.md and replace every <REPO_PATH> with the absolute path to your clone (e.g. /Users/yourname/paper-search-mcp).
Install Using Smithery and Node
```
npx -y @smithery/cli install @openags/paper-search-mcp --client claude
```
Smithery automatically writes the correct config block for you. No manual JSON editing needed.

APPENDIX B - Paper Search MCP

Platform Capability Matrix

This matrix reflects verified live-integration results from functional and end-to-end regression tests in this repository. Columns show the highest capability level observed under normal conditions.

Platform	Search	Download	Read	Notes
arXiv	✅	✅	✅	Open API; reliable
PubMed	✅	❌	⚠️ info-only	Open API; reliable
bioRxiv	✅	✅	✅	Open API; reliable
medRxiv	✅	✅	✅	Open API; reliable
Google Scholar	⚠️	❌	❌	Bot-detection active; set `PAPER_SEARCH_MCP_GOOGLE_SCHOLAR_PROXY_URL`
IACR	✅	✅	✅	Open API; reliable
Semantic Scholar	✅	✅ (OA)	✅ (OA)	Works without key (rate-limited); key improves limits; key rejection (403) retried automatically without key
Crossref	✅	❌	⚠️ info-only	Open API; reliable
OpenAlex	✅	❌	⚠️ info-only	Open API; reliable
PMC	✅	✅ (OA only)	✅ (OA only)	OA PDFs only; direct download may be blocked by some proxy environments
CORE	✅	✅ (record-dependent)	✅ (record-dependent)	Free key recommended; connector retries with backoff and falls back to key-less on 401/403
Europe PMC	✅	✅ (OA)	✅ (OA)	OA PDFs only; direct download may be blocked by some proxy environments
dblp	✅	❌	⚠️ info-only	Open API; reliable
OpenAIRE	✅	❌	❌	Open API; retries 3× with escalating request profiles on transient 403
CiteSeerX	⚠️	✅ (record-dependent)	⚠️	API endpoint intermittently unavailable / redirects to web archive
DOAJ	✅	⚠️ (URL-dependent)	⚠️ (URL-dependent)	PDF availability varies by article; free key raises rate limits
BASE	⚠️	✅ (record-dependent)	✅ (record-dependent)	OAI-PMH endpoint requires institutional IP registration; returns empty gracefully otherwise
Zenodo	✅	✅ (record-dependent)	✅ (record-dependent)	Open API; reliable
HAL	✅	✅ (record-dependent)	✅ (record-dependent)	Open API; reliable
SSRN	⚠️	⚠️ best-effort	⚠️ best-effort	403 bot-detection active; public PDF only
Unpaywall	✅ (DOI lookup)	❌	❌	Requires `PAPER_SEARCH_MCP_UNPAYWALL_EMAIL`
Sci-Hub (optional)	⚠️ fallback-only	✅	❌	Optional; unstable mirrors; user responsibility
IEEE Xplore 🔑	🚧 skeleton	🚧 skeleton	🚧 skeleton	Requires `PAPER_SEARCH_MCP_IEEE_API_KEY` to activate
ACM DL 🔑	🚧 skeleton	🚧 skeleton	🚧 skeleton	Requires `PAPER_SEARCH_MCP_ACM_API_KEY` to activate

✅ = reliable in live tests. ⚠️ = works but subject to upstream instability or access restrictions. ❌ = not supported. 🔑 = key required. 🚧 = skeleton only.

Credential & API Key Requirements

All keys are optional unless noted. Configure them in .env (preferred) or as shell exports.

Environment Variable	Provider	Required?	How to obtain
`PAPER_SEARCH_MCP_UNPAYWALL_EMAIL`	Unpaywall	Yes (Unpaywall disabled without it)	Any valid email; register at unpaywall.org
`PAPER_SEARCH_MCP_CORE_API_KEY`	CORE	Recommended	Free at core.ac.uk/services/api
`PAPER_SEARCH_MCP_SEMANTIC_SCHOLAR_API_KEY`	Semantic Scholar	Optional	Free at semanticscholar.org — improves rate limits
`PAPER_SEARCH_MCP_GOOGLE_SCHOLAR_PROXY_URL`	Google Scholar	Optional	Your HTTP/HTTPS proxy URL — bypasses bot-detection
`PAPER_SEARCH_MCP_DOAJ_API_KEY`	DOAJ	Optional	Free at doaj.org — raises hourly rate limit
`PAPER_SEARCH_MCP_ZENODO_ACCESS_TOKEN`	Zenodo	Optional	Free at zenodo.org — required for private records
`PAPER_SEARCH_MCP_IEEE_API_KEY`	IEEE Xplore	Required to activate	Free at developer.ieee.org
`PAPER_SEARCH_MCP_ACM_API_KEY`	ACM DL	Required to activate	See libraries.acm.org/digital-library/acm-open

All variables follow the PAPER_SEARCH_MCP_<NAME> prefix scheme. Legacy names without the prefix (e.g. CORE_API_KEY, UNPAYWALL_EMAIL) are still supported for backward compatibility.

Known Upstream Limitations

Some search failures are caused by external provider instability, not by bugs in this project:

Source	Symptom	Cause	Workaround
Google Scholar	Returns 0 results / empty HTML	Bot-detection (CAPTCHA)	Set `PAPER_SEARCH_MCP_GOOGLE_SCHOLAR_PROXY_URL` to a proxy
Semantic Scholar	429 rate-limited responses	Anonymous access rate limit	Set `PAPER_SEARCH_MCP_SEMANTIC_SCHOLAR_API_KEY`; if key is rejected (403) connector automatically retries without key
CORE	500 / timeout errors	Unauthenticated rate limiting	Set `PAPER_SEARCH_MCP_CORE_API_KEY` (free); connector retries with exponential backoff and falls back to key-less on 401/403
OpenAIRE	Transient 403 responses	IP-based session rate limiting	Connector retries 3× per profile, escalating: plain session → XML Accept header → raw `requests.get` with Mozilla UA
CiteSeerX	404 via web archive redirect	PSU endpoint intermittently redirects to archive	No workaround; connector returns empty gracefully
BASE	Search returns 0 results	OAI-PMH endpoint requires institutional IP registration	Register at base-search.net for API access; connector returns empty gracefully otherwise
SSRN	HTTP 403	Bot-detection (Cloudflare)	No workaround; connector tries two endpoints and returns a clear message on failure
PMC / Europe PMC	PDF download ProxyError	Local proxy blocking direct HTTPS PDF download	Disable proxy or use `download_with_fallback` instead
Unpaywall	Skipped entirely	`UNPAYWALL_EMAIL` env var not set	Set `PAPER_SEARCH_MCP_UNPAYWALL_EMAIL` in `.env`

Optional Paid Platform Connectors (Phase 3)

IEEE Xplore and ACM Digital Library connectors are included as opt-in skeletons. They are disabled by default — no API calls are made unless you explicitly configure the corresponding keys.

Platform	Env Var	Status
IEEE Xplore	`PAPER_SEARCH_MCP_IEEE_API_KEY`	🚧 skeleton — search registered, download/read raise `NotImplementedError`
ACM Digital Library	`PAPER_SEARCH_MCP_ACM_API_KEY`	🚧 skeleton — search registered, download/read raise `NotImplementedError`

Alexander Scheer